after SIX (6) MONTHS from the mailing date of this communication.

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 
- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133).

Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any 
earned patent term adjustment. See 37 CFR 1.704(b).

Status 

1) [X] Responsive to communication(s) filed on 16 March 2005.
2a) [X] This action is FINAL. 2b) [ ] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) [X] Claim(s) 1-22 is/are pending in the application.

4a) Of the above claim(s) ____ is/are withdrawn from consideration.

5) [ ] Claim(s) ____ is/are allowed.

6) [X] Claim(s) 1, 2, 7, 8 and 13-22 is/are rejected.

7) [X] Claim(s) 3-6 and 9-12 is/are objected to.

8) [ ] Claim(s) ____ are subject to restriction and/or election requirement.

Application Papers 

9) [ ] The specification is objected to by the Examiner.

10) [ ] The drawing(s) filed on ____ is/are: a) [ ] accepted or b) [ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).

Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).

11) [ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) [ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) [ ] All b) [ ] Some * c) [ ] None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. ____.

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).

* See the attached detailed Office action for a list of the certified copies not received.



Attachment(s) 

1) [X] Notice of References Cited (PTO-892)

2) [ ] Notice of Draftsperson's Patent Drawing Review (PTO-948)

3) [ ] Information Disclosure Statement(s) (PTO-1449 or PTO/SB/08)
Paper No(s)/Mail Date ____.

4) [ ] Interview Summary (PTO-413)
Paper No(s)/Mail Date ____.

5) [ ] Notice of Informal Patent Application (PTO-152)

6) [ ] Other: ____.



U.S. Patent and Trademark Office 
PTOL-326 (Rev. 1-04) 



Office Action Summary 



Part of Paper No./Mail Date 05232005 



Application/Control Number: 09/595,719

Art Unit: 2178 

DETAILED ACTION 

1. This action is responsive to communications: amendment filed 3/16/05 to the
application filed on 6/16/00. 

2. Claims 1-22 are pending in the case. Claims 1, 7, 14, 16, 18 are independent 
claims. 

3. The objection to claim 18 has been withdrawn in view of the amendment.

Claim Rejections - 35 USC § 102 

4. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in a patent granted on an application for patent by another filed in the 
United States before the invention thereof by the applicant for patent, or on an international application 
by another who has fulfilled the requirements of paragraphs (1), (2), and (4) of section 371(c) of this
title before the invention thereof by the applicant for patent. 

The changes made to 35 U.S.C. 102(e) by the American Inventors Protection Act 
of 1999 (AIPA) and the Intellectual Property and High Technology Technical 
Amendments Act of 2002 do not apply when the reference is a U.S. patent resulting 
directly or indirectly from an international application filed before November 29, 2000. 
Therefore, the prior art date of the reference is determined under 35 U.S.C. 102(e) prior 
to the amendment by the AIPA (pre-AIPA 35 U.S.C. 102(e)).

5. Claims 7-8 remain rejected under 35 U.S.C. 102(e) as being anticipated by
Papakonstantinou et al., DTD Inference for Views of XML Data, ACM, May 2000, pages
35-46.

Regarding independent claim 7, Papakonstantinou discloses:

- generalizing input sequences associated with a document to develop general
sequences, said input sequences reflecting the structure of a document (pages
35-36: "... XML marks the 'return of the schema' (albeit loose and flexible) in
semistructured data, in the form of its Data Type Definition (DTDs) ... DTDs
describe the structure of the objects (elements) participating in an XML
document ... variable bindings extracted by the tree pattern ... extract from
the input the list of subtrees ... the generalization to multiple sources is
straightforward, since these can be viewed as one source ...")

- selecting a document descriptor from said input sequences, said general
sequences where said factored sequences using minimum descriptor length
(MDL) principles (pages 35-36: "... variable bindings extracted by the tree
pattern ... extract from the input the list of subtrees to which one of the variables
in the tree pattern binds ... constructing a tight ltd for the view, i.e. an ltd that
precisely characterizes the type structures of trees ... we overcome these
limitations by enhancing ltds with a simple subtyping mechanism ... specialized
ltds encompass the expressive power of formalism ..."; the fact that the
specialized ltds are simple, precisely characterize the type structure of trees, and
encompass the data structure implies that the ltds, which are the DTDs, are
selected from the tag sequence using the minimum descriptor length; page 36:
"... the 'pattern' ... the limited form of inference can be accomplished by inferring
the pattern that view variables may bind to ...")



Regarding claim 8, which is dependent on claim 7, Papakonstantinou discloses:

- encoding said input sequences, said general sequences, and said factored 
sequences (pages 35-38: the tags of the document, which are the sequences, 
are encoded data) 

- selecting a document descriptor which encompasses all of said input sequences
and exhibits a minimum MDL cost (pages 35-36: "... variable bindings extracted
by the tree pattern ... extract from the input the list of subtrees to which one of
the variables in the tree pattern binds ... constructing a tight ltd for the view, i.e.
an ltd that precisely characterizes the type structures of trees ... we overcome
these limitations by enhancing ltds with a simple subtyping mechanism
... specialized ltds encompass the expressive power of formalism ..."; the fact that
the specialized ltds are simple, precisely characterize the type structure of trees,
and encompass the data structure implies that the ltds, which are the DTDs, are
selected from the tag sequence using the minimum descriptor length, and thus
have the minimum cost)

6. Claims 16-19 remain rejected under 35 U.S.C. 102(e) as being anticipated by
Moh et al., Re-engineering Structures from Web Documents, ACM June 2, 2000, pages 
67-76. 

Regarding independent claim 16, Moh discloses: 

- discovering OR patterns among said input sequences (pages 69, 73-74) 

- discovering sequence patterns among said input sequences and OR patterns 
(pages 74-75) 

Regarding claim 17, which is dependent on claim 16, Moh discloses that discovering 
OR patterns comprises partitioning said input sequences (page 73). 

Regarding independent claim 18, Moh discloses: 
Generalizing input sequences, said generalizing comprises: 

- discovering OR patterns among said input sequences (pages 69, 74) 

- discovering sequence patterns among said input sequences and OR patterns 
(pages 74-75) 

Selecting a document descriptor from said input sequence and said general sequences 
(page 72, Final Construction of DTD). 

Regarding claim 19, which is dependent on claim 18, Moh discloses that discovering
OR patterns comprises partitioning said input sequences (page 73).

Claims 14-15 are directed to a computer-readable medium corresponding to method
claims 16-17 and 18-19, and are rejected under the same rationale.

Claim Rejections - 35 USC § 103 

7. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

8. This application currently names joint inventors. In considering patentability of 
the claims under 35 U.S.C. 103(a), the examiner presumes that the subject matter of 
the various claims was commonly owned at the time any inventions covered therein 
were made absent any evidence to the contrary. Applicant is advised of the obligation 
under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was
not commonly owned at the time a later invention was made in order for the examiner to 
consider the applicability of 35 U.S.C. 103(c) and potential 35 U.S.C. 102(e), (f) or (g) 
prior art under 35 U.S.C. 103(a). 

9. Claims 1-2 remain rejected under 35 U.S.C. 103(a) as being unpatentable over 
Papakonstantinou et al., DTD Inference for Views of XML Data, ACM, May 2000, pages
35-46. 

Regarding independent claim 1, Papakonstantinou discloses:

- generalizing input sequences associated with a document to develop general
sequences, said input sequences reflecting the structure of a document (pages
35-36: "... XML marks the 'return of the schema' (albeit loose and flexible) in
semistructured data, in the form of its Data Type Definition (DTDs) ... DTDs
describe the structure of the objects (elements) participating in an XML
document ... variable bindings extracted by the tree pattern ... extract from
the input the list of subtrees ... the generalization to multiple sources is
straightforward, since these can be viewed as one source ...")

- selecting a document descriptor from said input sequences, said general
sequences, and said factored sequences using minimum descriptor length
(MDL) principles (pages 35-36: "... variable bindings extracted by the tree
pattern ... extract from the input the list of subtrees to which one of the variables
in the tree pattern binds ... constructing a tight ltd for the view, i.e. an ltd that
precisely characterizes the type structures of trees ... we overcome these
limitations by enhancing ltds with a simple subtyping mechanism ... specialized
ltds encompass the expressive power of formalism ..."; the fact that the
specialized ltds are simple, precisely characterize the type structure of trees, and
encompass the data structure implies that the ltds, which are the DTDs, are
selected from the tag sequence using the minimum descriptor length; page 36:
"... the 'pattern' ... the limited form of inference can be accomplished by inferring
the pattern that view variables may bind to ...")

- grouping the tags, where the tags show the input sequence of the structure of
the document (pages 37-39, Example 2.2, Example 2.7, Example 2.13)

Papakonstantinou does not explicitly disclose factoring said input sequences and said
general sequences to develop factored sequences.

However, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to have modified Papakonstantinou to include factoring the input
sequences and said general sequences to develop factored sequences, for the following
reason. The grouping of tag sequences in Papakonstantinou, in which the tag
sequence of a document is simple yet encompasses the formalism of the data,
suggests that tag names of the same type are grouped together to yield a
precise DTD with the shortest length.

Regarding claim 2, which is dependent on claim 1, Papakonstantinou discloses:

- encoding said input sequences, said general sequences, and said factored 
sequences (pages 35-38: the tags of the document, which are the sequences, 
are encoded data) 

- selecting a document descriptor which encompasses all of said input sequences
and exhibits a minimum MDL cost (pages 35-36: "... variable bindings extracted
by the tree pattern ... extract from the input the list of subtrees to which one of
the variables in the tree pattern binds ... constructing a tight ltd for the view, i.e.
an ltd that precisely characterizes the type structures of trees ... we overcome
these limitations by enhancing ltds with a simple subtyping mechanism
... specialized ltds encompass the expressive power of formalism ..."; the fact that
the specialized ltds are simple, precisely characterize the type structure of trees,
and encompass the data structure implies that the ltds, which are the DTDs, are
selected from the tag sequence using the minimum descriptor length, and thus
have the minimum cost)

10. Claims 1-2, 7-8, 13, 20-22 remain rejected under 35 U.S.C. 103(a) as being 
unpatentable over Moh et al. Re-engineering Structures from Web Documents, ACM, 
June 2, 2000, pages 67-76. 

Regarding independent claim 1, Moh discloses:

- generalizing input sequences associated with a document to develop general 
sequences, said input sequences reflecting the structure of a document (page 
74: the sequence of a document is generalized) 

- selecting a document descriptor from said input sequences, said general 
sequences, and said factored sequences using minimum descriptor length (MDL) 
principles (pages 74-76: the document DTD is derived from the sequence of 
document elements to reduce the repeated elements and thus providing a DTD 
with minimum descriptor length) 

Moh does not disclose factoring said input sequences and said general sequences to 
develop factored sequences. 

However, Moh does teach structural clustering of document tags (page 69). 
It would have been obvious to one of ordinary skill in the art at the time the invention
was made to have modified Moh to include the factoring step, since clustering
the structure of a document via the input sequence of the document tags in Moh
suggests that the repeated tags be clustered, that is, grouped together to form a
short sequence. It was well known in the art to cluster repeated items, such as the
same elements in a web page or documents of the same topic, to form a collection.
Incorporating the factoring step into Moh would help to accurately derive a precise
DTD for a document collection.

Regarding claim 2, which is dependent on claim 1, Moh discloses: 

- encoding said input sequences, said general sequences, and said factored 
sequences (pages 74-75: document tags are encoded) 

- selecting a document descriptor which encompasses all of said input sequences
and exhibits a minimum MDL cost (pages 74-76) 

Claims 7-8 include the limitations of claims 1-2, and are rejected under the same 
rationale. 

Regarding claim 13, which is dependent on claim 7, Moh does not disclose explicitly 
that factoring said input sequences and said general sequences to develop factored 
sequences, wherein said factored sequences are available to said selecting. 
However, Moh does teach structural clustering of document tags (page 69). 
It would have been obvious to one of ordinary skill in the art at the time the invention
was made to have modified Moh to include the factoring step, since clustering
the structure of a document via the input sequence of the document tags in Moh
suggests that the repeated tags be clustered, that is, grouped together to form a
short sequence. It was well known in the art to cluster repeated items, such as the
same elements in a web page or documents of the same topic, to form a collection.
Therefore, incorporating the factoring step into Moh would help to accurately derive a
concise and precise DTD for a document collection.

Regarding claim 20, which is dependent on claim 19, Moh does not disclose explicitly 
that factoring said input sequences and said general sequences to develop factored 
sequences, wherein said factored sequences are available to said selecting. 
However, Moh does teach structural clustering of document tags (page 69). 
It would have been obvious to one of ordinary skill in the art at the time the invention
was made to have modified Moh to include the factoring step, since clustering
the structure of a document via the input sequence of the document tags in Moh
suggests that the repeated tags be clustered, that is, grouped together to form a
short sequence. It was well known in the art to cluster repeated items, such as the
same elements in a web page or documents of the same topic, to form a collection.
Therefore, incorporating the factoring step into Moh would help to accurately derive a
concise and precise DTD for a document collection.

Regarding claim 21, which is dependent on claim 20, Moh does not disclose explicitly
that said selecting employs minimum descriptor length (MDL) principles (page 71). 

Regarding claim 22, which is dependent on claim 21, Moh discloses that said document 
descriptor is a document type descriptor (DTD) and said document is an Extensible
Markup Language (XML) document (pages 67-75). 

Allowable Subject Matter 

11. Claims 3-6, 9-12 are objected to as being dependent upon a rejected base claim, 
but would be allowable if rewritten in independent form including all of the limitations of 
the base claim and any intervening claims. 

Response to Amendment 

12. The declaration filed on 3/16/05 under 37 CFR 1.131 has been considered but is 
ineffective to overcome the Moh and Papakonstantinou references. 

The declarations were not signed by all the co-inventors and a petition under 37 CFR
1.47 in this application was not granted. MPEP 715.04 (A-D) states:

"The following parties may make an affidavit or declaration under 37 CFR 1.131:

(A) All the inventors of the subject matter claimed. 

(B) An affidavit or declaration by less than all named inventors of an application is 
accepted where it is shown that less than all named inventors of an application 
invented the subject matter of the claim or claims under rejection. For example,
one of two joint inventors is accepted where it is shown that one of the joint 
inventors is the sole inventor of the claim or claims under rejection. 

(C) If a petition under 37 CFR 1.47 was granted or the application was
accepted under 37 CFR 1.42 or 1.43, the affidavit or declaration may be signed
by the 37 CFR 1.47 applicant or the legal representative, where appropriate.

(D) The assignee or other party in interest when it is not possible to produce the
affidavit or declaration of the inventor. Ex parte Foster, 1903 C.D. 213,
105 O.G. 261 (Comm'r Pat. 1903).

Affidavits or declarations to overcome a rejection of a claim or claims must be made 
by the inventor or inventors of the subject matter of the rejected claim(s), a party 
qualified under 37 CFR 1.42, 1.43, or 1.47, or the assignee or other party in interest 
when it is not possible to produce the affidavit or declaration of the inventor(s). Thus, 
where all of the named inventors of a pending application are not inventors of every 
claim of the application, any affidavit under 37 CFR 1.131 could be signed by only the 
inventor(s) of the subject matter of the rejected claims. Further, where it is shown that a 
joint inventor is deceased, refuses to sign, or is otherwise unavailable, the signatures of
the remaining joint inventors are sufficient. However, the affidavit or declaration, even 
though signed by fewer than all the joint inventors, must show completion of the 
invention by all of the joint inventors of the subject matter of the claim(s) under rejection. 
In re Carlson, 79 F.2d 900, 27 USPQ 400 (CCPA 1935)" 

As noted in (A) above, all inventors must sign the declaration under 37 CFR 1.131, or a
petition under 37 CFR 1.47 is required. 37 CFR 1.47 states in part:

"The oath or declaration in such an application must be accompanied by a petition
including proof of the pertinent facts, the fee set forth in § 1.17(h), and the last known
address of any nonsigning inventor"

Applicants should consider having the assignee of the invention file the declarations
and petition where it is not possible to obtain the signatures of all the
co-inventors. See MPEP 715.04(D).

It is unclear what Applicants would like to establish: 1) conception coupled with due 
diligence or 2) reduction to practice prior to the effective date of the reference in the 
submitted declaration, since both are mentioned in the declaration and in Applicants' 
remarks. Applicants should consider filing a proper declaration in light of MPEP 715.07 
(III) and MPEP 715.07 (a). 

In any case, Exhibits A, B, and C, submitted as a written description of the invention, do 
not constitute an actual reduction to practice or establish conception coupled with due 
diligence. Furthermore, only the filing of a US patent application which complies with
the disclosure requirement of 35 USC 112 constitutes a constructive reduction to 
practice. A written description, no matter how complete, which has not been made the 
subject of a US patent application, does not qualify as reduction to practice or 
conception coupled with due diligence. 

Accordingly, Applicants have not established prior invention. The rejection is 
maintained. 



Conclusion

13. Applicant's amendment necessitated the new ground(s) of rejection presented in
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37
CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

14. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Nussbaum et al. (US Pat No. 6,779,154 B1, 8/17/04, filed 2/1/00). 

Munetsugu et al. (US Pat App Pub No. 2004/0133569 A1, 7/8/04, filed 12/11/03, priority
12/20/99).

15. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Cong-Lac Huynh whose telephone number is 571-272- 
4125. The examiner can normally be reached on Mon-Fri (8:30-6:00). 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Stephen Hong, can be reached on 571-272-4124. The fax phone number for
the organization where this application or proceeding is assigned is 571-273-4125. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 



Cong-Lac Huynh 
Examiner 
Art Unit 2178
5/23/05 




